    Optimizing MAKWA on GPU and CPU

    We present here optimized implementations of the MAKWA password hashing function on an AMD Radeon HD 7990 GPU, and compare its efficiency with an Intel i7 4770K CPU for systematic dictionary attacks. We find that the GPU seems to get more hashing done for a given budget, but not by a large amount (the GPU is less than twice as efficient as the CPU). Raising the MAKWA modulus size to 4096 bits, instead of the default 2048 bits, should restore the balance in favour of the CPU. We also find that power consumption, not hardware retail price, is likely to become the dominant factor for industrialized, long-term attacking efforts.

    EcGFp5: a Specialized Elliptic Curve

    We present here the design and implementation of ecGFp5, an elliptic curve meant for a specific compute model in which operations modulo a given 64-bit prime are especially efficient. This model is primarily intended for running operations in a virtual machine that produces and verifies zero-knowledge STARK proofs. We describe here the choice of a secure curve, amenable to safe cryptographic operations such as digital signatures, that maps to such models, while still providing reasonable performance on general purpose computers.

    Paradoxical Compression with Verifiable Delay Functions

    Lossless compression algorithms such as DEFLATE strive to reliably process arbitrary inputs, while achieving compressed sizes as low as possible for commonly encountered data inputs. It is well-known that it is mathematically impossible for a compression algorithm to simultaneously achieve non-trivial compression on some inputs (i.e. compress these inputs into strictly shorter outputs) and to never expand any other input (i.e. guaranteeing that all inputs will be compressed into an output which is no longer than the input); this is a direct application of the pigeonhole principle. Despite their mathematical impossibility, we show in this paper how to build such paradoxical compression and decompression algorithms, with the aid of some tools from cryptography, notably verifiable delay functions, and, of course, by slightly cheating.
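
    The pigeonhole argument mentioned in this abstract can be spelled out as follows. This is a standard counting derivation, written here for bit strings as an assumption about the input alphabet; it is not taken from the paper itself.

```latex
Let $S_{\le n}$ be the set of bit strings of length at most $n$, so that
$|S_{\le n}| = 2^{n+1} - 1$. Let $C$ be a lossless compressor, i.e. an
injective map on bit strings, that never expands its input:
\[
  C(S_{\le n}) \subseteq S_{\le n} \quad \text{for all } n \ge 0 .
\]
An injective map from a finite set into itself is a bijection, hence
\[
  C(S_{\le n}) = S_{\le n} \quad \text{for all } n \ge 0 .
\]
Now take any $x$ of length exactly $n \ge 1$. Every element of
$S_{\le n-1} = C(S_{\le n-1})$ is already the image of some string of length
at most $n-1$, and $x \notin S_{\le n-1}$, so injectivity gives
$C(x) \notin S_{\le n-1}$. Therefore $|C(x)| = n$: no input is ever
compressed into a strictly shorter output.
```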

    Truncated EdDSA/ECDSA Signatures

    This note presents some techniques to slightly reduce the size of EdDSA and ECDSA signatures without lowering their security or breaking compatibility with existing signers, at the cost of an increase in signature verification time; verifying a 64-byte Ed25519 signature truncated to 60 bytes has an average cost of 4.1 million cycles on 64-bit x86 (i.e. about 35 times the cost of verifying a normal, untruncated signature).

    More Efficient Algorithms for the NTRU Key Generation using the Field Norm

    NTRU lattices are a class of polynomial rings which allow for compact and efficient representations of the lattice basis, thereby offering very good performance characteristics for the asymmetric algorithms that use them. Signature algorithms based on NTRU lattices have fast signature generation and verification, and relatively small signatures, public keys and private keys. A few lattice-based cryptographic schemes entail, generally during the key generation, solving the NTRU equation: $fG - gF = q \bmod (x^n + 1)$. Here $f$ and $g$ are fixed, the goal is to compute solutions $F$ and $G$ to the equation, and all the polynomials are in $\mathbb{Z}[x]/(x^n + 1)$. The existing methods for solving this equation are quite cumbersome: their time and space complexities are at least cubic and quadratic in the dimension $n$, and for typical parameters they therefore require several megabytes of RAM and take more than a second on a typical laptop, precluding onboard key generation in embedded systems such as smart cards. In this work, we present two new algorithms for solving the NTRU equation. Both algorithms make a repeated use of the field norm in tower of fields; it allows them to be faster and more compact than existing algorithms by factors $\tilde{O}(n)$. For lattice-based schemes considered in practice, this reduces both the computation time and RAM usage by factors at least 100, making key pair generation within range of smart card abilities.
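
    To make the NTRU equation concrete, the sketch below checks whether given polynomials satisfy it in the ring of integer polynomials modulo x^n + 1. It only verifies a candidate solution with schoolbook multiplication; it does not implement the field-norm solvers the abstract describes. The function names and the toy values (n = 2, q = 5) are illustrative assumptions, not parameters from the paper.

```python
def negacyclic_mul(a, b):
    """Multiply two polynomials in Z[x]/(x^n + 1).

    Polynomials are lists of n integer coefficients, lowest degree first.
    Since x^n = -1 in this ring, terms that wrap around change sign.
    """
    n = len(a)
    if len(b) != n:
        raise ValueError("operands must have the same length")
    out = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            k = i + j
            if k < n:
                out[k] += ai * bj
            else:
                out[k - n] -= ai * bj
    return out


def satisfies_ntru_equation(f, g, F, G, q):
    """Return True if f*G - g*F equals the constant q modulo x^n + 1."""
    n = len(f)
    fG = negacyclic_mul(f, G)
    gF = negacyclic_mul(g, F)
    lhs = [s - t for s, t in zip(fG, gF)]
    return lhs == [q] + [0] * (n - 1)


# Toy instance (hypothetical values): f = 2 + x, g = 1 + x, q = 5.
f, g = [2, 1], [1, 1]
F, G = [7, 1], [6, 1]
print(satisfies_ntru_equation(f, g, F, G, 5))
```

    In this toy instance f*G = 11 + 8x and g*F = 6 + 8x, so the difference is the constant 5.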

    Improved Key Pair Generation for Falcon, BAT and Hawk

    In this short note, we describe a few implementation techniques that allow performing key pair generation for the Falcon and Hawk lattice-based signature schemes, and for the BAT key encapsulation scheme, in a fully constant-time way and without any use of floating-point operations. Our new code is faster than previously published implementations, especially when running on small embedded systems, and uses less RAM.

    Optimized Binary GCD for Modular Inversion

    In this short note, we describe a practical optimization of the well-known extended binary GCD algorithm, for the purpose of computing modular inverses. The method is conceptually simple and is applicable to all odd moduli (including non-prime moduli). When implemented for inversion in the field of integers modulo the prime $2^{255}-19$, on a recent x86 CPU (Coffee Lake core), we compute the inverse in 6253 cycles, with a fully constant-time implementation.
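
    For orientation, here is a minimal sketch of the textbook extended binary GCD used for modular inversion, i.e. the well-known baseline that the abstract refers to. It is not the optimized method of the note, and it is not constant-time (it branches on secret-dependent values). The function name is an illustrative assumption.

```python
def modinv_binary_gcd(y, m):
    """Return the inverse of y modulo an odd modulus m (textbook version).

    Invariants maintained throughout: a == u*y (mod m) and b == v*y (mod m),
    with b always odd. NOT constant-time: illustration only.
    """
    if m < 3 or m % 2 == 0:
        raise ValueError("modulus must be odd and at least 3")

    def half_mod(t):
        # Halve t modulo m; m is odd, so t + m is even whenever t is odd.
        return t // 2 if t % 2 == 0 else (t + m) // 2

    a, b = y % m, m
    u, v = 1, 0
    while a != 0:
        if a % 2 == 0:
            a //= 2
            u = half_mod(u)
        else:
            if a < b:
                a, b = b, a
                u, v = v, u
            # a and b are both odd here, so a - b is even.
            a = (a - b) // 2
            u = half_mod((u - v) % m)
    # At this point b == gcd(y, m).
    if b != 1:
        raise ValueError("value is not invertible modulo m")
    return v % m


p = 2**255 - 19
x = modinv_binary_gcd(123456789, p)
print((x * 123456789) % p == 1)
```

    The modulus only needs to be odd, not prime, which matches the applicability stated in the abstract; non-invertible inputs are reported with an exception in this sketch.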

    Efficient and Complete Formulas for Binary Curves

    Binary elliptic curves are elliptic curves defined over finite fields of characteristic 2. On software platforms that offer carryless multiplication opcodes (e.g. pclmul on x86), they have very good performance. However, they suffer from some drawbacks, in particular that non-supersingular binary curves have an even order, and that most known formulas for point operations have exceptional cases that are detrimental to safe implementation. In this paper, we show how to make a prime order group abstraction out of standard binary curves. We describe a new canonical compression scheme that yields a canonical and compact encoding. We also describe complete formulas for operations on the group. The formulas have no exceptional case, and are furthermore faster than previously known complete and incomplete formulas (general point addition in cost 8M+2S+2mb on all curves, 7M+2S+2mb on half of the curves). We also show how the same formulas can be applied to computations on the entire original curve, if full backward compatibility with standard curves is needed. Finally, we implemented our method over the standard NIST curves B-233 and K-233. Our strictly constant-time code achieves generic point multiplication by a scalar on curve K-233 in as little as 29600 clock cycles on an Intel x86 CPU (Coffee Lake core).

    Double-Odd Jacobi Quartic

    Double-odd curves are curves with order equal to 2 modulo 4. A prime order group with complete formulas and a canonical encoding/decoding process could previously be built over a double-odd curve. In this paper, we reformulate such curves as a specific case of the Jacobi quartic. This allows using slightly faster formulas for point operations, as well as defining a more efficient encoding format, so that decoding and encoding have the same cost as classic point compression (decoding is one square root, encoding is one inversion). We define the prime-order groups jq255e and jq255s as the application of that modified encoding to the do255e and do255s groups. We furthermore define an optimized signature mechanism on these groups, that offers shorter signatures (48 bytes instead of the usual 64 bytes, for 128-bit security) and makes signature verification faster (down to less than 83000 cycles on an Intel x86 Coffee Lake core).

    Optimized Lattice Basis Reduction In Dimension 2, and Fast Schnorr and EdDSA Signature Verification

    We present an optimization of Lagrange's algorithm for lattice basis reduction in dimension 2. The optimized algorithm is proven to be correct and to always terminate with quadratic complexity; it uses more iterations on average than Lagrange's algorithm, but each iteration is much simpler to implement, and faster. The achieved speed is such that it makes application of the speed-up on ECDSA and EC Schnorr signatures described by Antipa et al. worthwhile, even for very fast curves such as Ed25519. We applied this technique to signature verification in Curve9767, and reduced verification time by 30 to 33% on both small (ARM Cortex M0+ and M4) and large (Intel Coffee Lake with AVX2) architectures.
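
    As a point of reference, the classic Lagrange algorithm that this abstract takes as its baseline can be sketched as below. This is the textbook version with exact integer arithmetic, not the optimized variant presented in the paper; the function name and the example basis are illustrative assumptions.

```python
def lagrange_reduce(u, v):
    """Reduce a basis (u, v) of a 2-dimensional integer lattice.

    u and v are pairs of integers and must be linearly independent.
    Returns (s, t) with s a shortest nonzero lattice vector and
    |s| <= |t|. Textbook Lagrange reduction, exact integer arithmetic.
    """
    def norm2(w):
        return w[0] * w[0] + w[1] * w[1]

    # Arrange the pair so that v is the shorter vector.
    if norm2(u) < norm2(v):
        u, v = v, u
    while True:
        n = norm2(v)
        if n == 0:
            raise ValueError("input vectors are linearly dependent")
        dot = u[0] * v[0] + u[1] * v[1]
        # Nearest integer to dot / n, computed without floating point.
        m = (2 * dot + n) // (2 * n)
        r = (u[0] - m * v[0], u[1] - m * v[1])
        if norm2(r) >= n:
            # No further shortening is possible: (v, r) is reduced.
            return v, r
        u, v = v, r


# Hypothetical example: a skewed basis of the lattice Z^2.
print(lagrange_reduce((5, 8), (8, 13)))
```

    Each iteration strictly decreases the squared norm of the shorter vector, which is why the loop terminates; the per-iteration cost is dominated by the multiplications and the division used for rounding.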